On evaluating synthesised visual speech

Authors

  • Barry-John Theobald
  • Nicholas Wilkinson
  • Iain A. Matthews
Abstract

This paper describes issues relating to the subjective evaluation of synthesised visual speech. Two approaches to synthesis are compared: a text-driven synthesiser and a speech-driven synthesiser. Both synthesisers are trained using the same data and both use the same model for rendering the synthesised visual speech. Naturalness is used as a performance metric, and the naturalness of real visual speech re-rendered on the same model is used as a benchmark. The naturalness of the text-driven synthesiser is significantly better than that of the speech-driven synthesiser, but neither synthesiser can yet achieve the naturalness of real visual speech. The impact of likely sources of error apparent in the synthesised visual speech is investigated. Similar forms of error are introduced into real visual speech sequences, and the degradation in naturalness is measured using the same naturalness ratings used to evaluate the performance of the synthesisers. We find that the overall perception of sentence-level utterances is severely degraded when only a small region of an otherwise perfect rendering of the visual sequence is incorrect. For example, if the visual gesture for only a single syllable in an utterance is incorrect, the overall naturalness of this real sequence is rated lower than that of the text-based synthesiser.

Related resources

Animation of a Hierarchical Appearance Based Facial Model and Perceptual Analysis of Visual Speech

In this thesis, a hierarchical image-based 2D talking head model is presented, together with robust automatic and semi-automatic animation techniques, and a novel perceptual method for evaluating visual speech based on the McGurk effect. The novelty of the hierarchical facial model stems from the fact that sub-facial areas are modelled individually. To produce a facial animation, animations for ...

Evaluating comprehension of natural and synthetic conversational speech

Current speech synthesis methods typically operate on isolated sentences and lack convincing prosody when generating longer segments of speech. Similarly, prevailing TTS evaluation paradigms, such as intelligibility (transcription word error rate) or MOS, only score sentences in isolation, even though overall comprehension arguably is more important for speech-based communication. In an effort ...

A probabilistic trajectory synthesis system for synthesising visual speech

We describe an unsupervised probabilistic approach for synthesising visual speech from audio. Acoustic features representing a training corpus are clustered and the probability density function (PDF) of each cluster is modelled as a Gaussian mixture model (GMM). A visual target in the form of a short-term parameter trajectory is generated for each cluster. Synthesis involves combining the cluste...

Image-based Talking Heads using Radial Basis Functions

In recent years talking heads have received a great deal of interest, both in their application to natural human-computer dialogue, and their benefit to the intelligibility of synthesised speech. A model for the realistic synthesis of visual speech animation is described in this paper. Images representing the key visual speech poses (visemes) are pre-recorded and labelled. Transitions between vi...

Refinement of lip shape in sign speech synthesis

This paper deals with an analysis of lip shapes during speech that accompanies sign language, referred to as sign speech. A new sign speech database is collected and a new framework for the analysis of mouth patterns is introduced. Using a shape model restricted to the outer lip contour, we show that the articulatory parameters for visual speech alone are not sufficient for representing sign sp...

Publication date: 2008